
test(router): entity-splitter module POC — split _entities fetches and fan out in parallel#2775

Draft
jensneuse wants to merge 1 commit into main from obvious-court

Conversation

@jensneuse
Member

@jensneuse jensneuse commented Apr 20, 2026

Summary

Proof-of-concept custom router module that splits large _entities subgraph fetches into parallel sub-fetches and reassembles the response — demonstrating that request-level batching can be implemented entirely as a module, without engine changes.

  • Module lives in router-tests/modules/entity-splitter/ and is test-only.
  • Uses EnginePreOriginHandler.OnOriginRequest. When the incoming body exceeds SplitThreshold bytes and carries _entities representations, the module fans out BatchSize-sized chunks in parallel, merges results in order, and returns a synthetic *http.Response that short-circuits the real origin fetch.
  • astjson parser + arena pool for parsing incoming bodies; bytes are copied off-arena before release so they outlive pool reuse.
  • Forwards OTel trace context on sub-requests via otel.GetTextMapPropagator().Inject so each sub-fetch appears as a child span of the original subgraph request.
  • Order-preserving merge with null-fill + reindexed error paths for failed chunks (local _entities[i] rewritten to absolute _entities[absStart+i]).
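
The chunking and path-reindexing steps described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the module's code: `splitReps`, `reindexPath`, and the `chunk` type are hypothetical names, and error-path indices are modeled as Go `int` values (paths decoded with `encoding/json` would carry `float64` instead).

```go
package main

import "fmt"

// chunk is a hypothetical type: one batch of representations plus the
// absolute index of its first element in the original array.
type chunk struct {
	absStart int
	reps     []string
}

// splitReps cuts reps into consecutive chunks of at most batchSize elements.
// A non-positive batchSize yields a single chunk holding everything.
func splitReps(reps []string, batchSize int) []chunk {
	if batchSize <= 0 {
		return []chunk{{absStart: 0, reps: reps}}
	}
	var out []chunk
	for start := 0; start < len(reps); start += batchSize {
		end := start + batchSize
		if end > len(reps) {
			end = len(reps)
		}
		out = append(out, chunk{absStart: start, reps: reps[start:end]})
	}
	return out
}

// reindexPath returns a copy of path in which a chunk-local index
// ["_entities", i, ...] is rewritten to the absolute ["_entities", absStart+i, ...].
// Paths that do not start with "_entities" are returned unchanged.
func reindexPath(path []any, absStart int) []any {
	out := append([]any(nil), path...)
	if len(out) >= 2 && out[0] == "_entities" {
		if i, ok := out[1].(int); ok {
			out[1] = absStart + i
		}
	}
	return out
}

func main() {
	reps := make([]string, 25)
	for _, c := range splitReps(reps, 10) {
		fmt.Printf("chunk at %d with %d representations\n", c.absStart, len(c.reps))
	}
	// A local error at position 3 of the chunk starting at 40 maps to position 43.
	fmt.Println(reindexPath([]any{"_entities", 3, "hobbies"}, 40))
}
```

Recording `absStart` at split time is what lets a later merge step place results and errors without depending on the order in which the parallel sub-fetches complete.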

Tests

go test ./router-tests/modules/ -run TestEntitySplitterModule -race — three subtests:

  • Split: 100 representations batched into 10 parallel sub-fetches of 10. Asserts exact client response with distinct name:"employee-<id>" per position, proving order preservation end-to-end.
  • Passthrough: 3 representations, request body below threshold, single sub-fetch, zero split count.
  • Error chunk: Hobbies subgraph returns 500 for any sub-fetch carrying ids 41–50. Asserts data.employees is exactly 100 positions in order with hobbies:null at ids 41–50 and distinct names elsewhere; engine surfaces one top-level subgraph-fetch error with the 10 per-position errors nested under extensions.errors.
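
The null-fill behavior asserted by the error-chunk subtest can be illustrated with a small merge routine. This is a sketch only: `chunkResult` reuses the `absStart`, `chunkLen`, and `fetchErr` field names that appear in the review below, but the `entities` field, the `mergeEntities` function, and the overall layout are assumptions rather than the module's actual types.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunkResult is a hypothetical per-chunk outcome. entities holds the raw
// JSON of each returned entity; fetchErr is non-nil when the sub-fetch failed.
type chunkResult struct {
	absStart int
	chunkLen int
	entities [][]byte
	fetchErr error
}

// mergeEntities builds the final _entities array. Every slot starts as JSON
// null, and only chunks that succeeded with the expected length overwrite
// their range, so failed chunks leave nulls at the correct absolute positions.
func mergeEntities(total int, results []chunkResult) [][]byte {
	merged := make([][]byte, total)
	for i := range merged {
		merged[i] = []byte("null")
	}
	for _, r := range results {
		if r.fetchErr != nil || len(r.entities) != r.chunkLen {
			continue // keep the null fill for this chunk
		}
		if r.absStart < 0 || r.absStart+r.chunkLen > total {
			continue // defensive: ignore results that fall outside the array
		}
		copy(merged[r.absStart:], r.entities)
	}
	return merged
}

func main() {
	results := []chunkResult{
		{absStart: 0, chunkLen: 2, entities: [][]byte{[]byte(`{"id":1}`), []byte(`{"id":2}`)}},
		{absStart: 2, chunkLen: 2, fetchErr: fmt.Errorf("sub-fetch returned status 500")},
	}
	merged := mergeEntities(4, results)
	fmt.Printf("[%s]\n", bytes.Join(merged, []byte(",")))
}
```

Pre-filling with nulls keeps the merged array at its full length regardless of how many chunks fail, which is the property the subtest checks when it expects exactly 100 positions.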

Test plan

  • cd router-tests && go test ./modules/ -run TestEntitySplitterModule -race -v
  • Manual sanity check that the three subtests cover split / passthrough / error scenarios
  • Decide whether any of this pattern should graduate from test-only POC into a shipped module or engine primitive

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added entity splitting capability for large GraphQL entity fetch requests, improving handling efficiency by breaking operations into smaller concurrent batches while preserving response ordering.
  • Tests

    • Added comprehensive test coverage for entity splitting functionality, including scenarios for large batches, pass-through operations, and failure handling.

…es fan-out

Custom router module that intercepts outgoing _entities subgraph fetches via
EnginePreOriginHandler.OnOriginRequest, splits the representations array into
batches, fetches each batch in parallel, and merges the responses back into a
single synthetic *http.Response that short-circuits the real origin call.

Demonstrates feasibility of request-level batching without engine changes.

Implementation notes:
- astjson parser + arena pool for parsing incoming bodies
- simple body-size threshold for triggering splits
- OTel trace context propagation on fan-out sub-requests
- order-preserving merge with null-fill + reindexed error paths on failures

Integration tests (router-tests/modules/entity_splitter_test.go):
- 100 reps split into 10 parallel fetches of 10, exact-order assertion
- small request below threshold passes through unsplit
- one failed chunk: data carries nulls at the right positions and the engine
  surfaces the per-position errors under extensions.errors

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 20, 2026

Walkthrough

The changes introduce a new entity-splitter module for router-tests that intercepts large _entities requests, splits them into configurable batch sizes, processes chunks concurrently, and merges results. A corresponding test file validates the splitting logic across scenarios, and a go.mod dependency is promoted to direct.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency Management (router-tests/go.mod) | Promoted github.com/wundergraph/go-arena v1.1.0 from indirect to direct dependency requirement. |
| Entity Splitter Module Implementation (router-tests/modules/entity-splitter/module.go) | New EntitySplitterModule struct implementing core.EnginePreOriginHandler to intercept POST _entities requests. Parses GraphQL JSON body, splits representations array into configurable BatchSize chunks when exceeding SplitThreshold, performs concurrent HTTP POST sub-fetches with header/telemetry propagation, and merges responses into single 200 JSON with reindexed error paths. Includes metrics counters for sub-fetch and split request tracking. |
| Entity Splitter Tests (router-tests/modules/entity_splitter_test.go) | Comprehensive test coverage for EntitySplitterModule with three scenarios: (1) large 100-representation fetch split into 10 parallel chunks with correct ordering, (2) small 3-representation fetch passing through unsplit, and (3) chunk failure handling with null insertion and GraphQL error surfacing. Includes request/response builders and HTTP middleware mocks for employee and hobbies subgraphs. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 26.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: a new entity-splitter module that splits _entities fetches and executes them in parallel, which is the core functionality across all modified files. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.4)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"



@github-actions

Router-nonroot image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-b9bf7a65a46d016d48d98f015d22c4bedc1462e8-nonroot

@github-actions

Router image scan passed

✅ No security vulnerabilities found in image:

ghcr.io/wundergraph/cosmo/router:sha-b9bf7a65a46d016d48d98f015d22c4bedc1462e8

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
router-tests/modules/entity-splitter/module.go (2)

291-299: Response body is read before status check, wasting I/O on errors.

The body is fully read (line 291) before checking status code (line 296). For non-200 responses, this reads potentially large error bodies unnecessarily.

♻️ Minor optimization
 	resp, err := m.Client.Do(subReq)
 	if err != nil {
 		out.fetchErr = fmt.Errorf("sub-fetch: %w", err)
 		return
 	}
 	defer resp.Body.Close()

+	if resp.StatusCode != http.StatusOK {
+		out.fetchErr = fmt.Errorf("sub-fetch returned status %d", resp.StatusCode)
+		return
+	}
+
 	bodyBytes, err := io.ReadAll(resp.Body)
 	if err != nil {
 		out.fetchErr = fmt.Errorf("read sub-response: %w", err)
 		return
 	}
-	if resp.StatusCode != http.StatusOK {
-		out.fetchErr = fmt.Errorf("sub-fetch returned status %d", resp.StatusCode)
-		return
-	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/modules/entity-splitter/module.go` around lines 291 - 299, The
code reads resp.Body into bodyBytes before checking resp.StatusCode, causing
unnecessary I/O for non-200 responses; change the logic in the fetch handling
(the block using resp, bodyBytes and out.fetchErr) to first inspect
resp.StatusCode and set out.fetchErr for non-OK statuses without reading the
full body, and only call io.ReadAll(resp.Body) when resp.StatusCode ==
http.StatusOK (optionally reading a small/truncated body for error logging if
needed) so you avoid reading large error payloads unnecessarily.
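
The "check status first, optionally read a truncated body" variant mentioned in the prompt above might look like this. It is a sketch under assumptions: `readSubResponse`, `fakeResponse`, and the 512-byte `maxErrSnippet` cap are hypothetical and do not come from the module.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// maxErrSnippet is a hypothetical cap on how much of an error body is kept.
const maxErrSnippet = 512

// readSubResponse checks the status code before reading the body. For non-200
// responses it reads at most maxErrSnippet bytes for diagnostics instead of
// buffering the whole payload.
func readSubResponse(resp *http.Response) ([]byte, error) {
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		snippet, _ := io.ReadAll(io.LimitReader(resp.Body, maxErrSnippet))
		return nil, fmt.Errorf("sub-fetch returned status %d: %q", resp.StatusCode, snippet)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, fmt.Errorf("read sub-response: %w", err)
	}
	return body, nil
}

// fakeResponse builds an in-memory response so the sketch runs without a server.
func fakeResponse(status int, body string) *http.Response {
	return &http.Response{
		StatusCode: status,
		Body:       io.NopCloser(strings.NewReader(body)),
	}
}

func main() {
	if _, err := readSubResponse(fakeResponse(http.StatusInternalServerError, "upstream exploded")); err != nil {
		fmt.Println("error:", err)
	}
	body, err := readSubResponse(fakeResponse(http.StatusOK, `{"data":{"_entities":[]}}`))
	fmt.Println(string(body), err)
}
```

One trade-off to keep in mind: `net/http` may be unable to reuse a keep-alive connection when a body is closed without being read to EOF, so skipping large error bodies saves I/O at the possible cost of a new connection for the next request.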

241-253: Prefer sync.WaitGroup.Go for cleaner goroutine management.

Go 1.25+ provides wg.Go(func()) which handles Add/Done automatically. This is the preferred pattern in this repository.

♻️ Proposed refactor
-	var wg sync.WaitGroup
-	wg.Add(len(chunks))
 	absStart := 0
+	var wg sync.WaitGroup
 	for i, chunk := range chunks {
 		results[i].absStart = absStart
 		results[i].chunkLen = len(chunk)
 		absStart += len(chunk)
-		go func(idx int, chunkReps [][]byte) {
-			defer wg.Done()
+		wg.Go(func() {
 			m.SubFetchCount.Add(1)
-			m.runSubFetch(req, pr, chunkReps, &results[idx])
-		}(i, chunk)
+			m.runSubFetch(req, pr, chunk, &results[i])
+		})
 	}
 	wg.Wait()

Based on learnings: "In Go code (Go 1.25+), prefer using sync.WaitGroup.Go(func()) to run a function in a new goroutine, letting the WaitGroup manage Add/Done automatically."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/modules/entity-splitter/module.go` around lines 241 - 253,
Replace the manual wg.Add(len(chunks)) + go func(...){ defer wg.Done() ... }
pattern with sync.WaitGroup.Go for each chunk: remove the wg.Add call, and
inside the loop call wg.Go(func() { m.SubFetchCount.Add(1); m.runSubFetch(req,
pr, chunkReps, &results[idx]) }) while ensuring you capture loop variables
safely (pass idx and chunk as parameters or assign to locals like i2, chunk2) so
m.runSubFetch and results[i] use the correct values; keep existing assignments
to results[i].absStart and results[i].chunkLen before launching the goroutine.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c6fdc012-9e02-45a3-896f-68fbac33edff

📥 Commits

Reviewing files that changed from the base of the PR and between d86216f and 6a25926.

📒 Files selected for processing (3)
  • router-tests/go.mod
  • router-tests/modules/entity-splitter/module.go
  • router-tests/modules/entity_splitter_test.go

Comment on lines +71 to +79
func (m *EntitySplitterModule) Module() core.ModuleInfo {
	return core.ModuleInfo{
		ID:       moduleID,
		Priority: 1,
		New: func() core.Module {
			return m
		},
	}
}
Contributor


⚠️ Potential issue | 🟠 Major

Module.New returns m (the receiver) instead of a fresh instance.

The New factory function in ModuleInfo returns the receiver m instead of creating a new instance, so every router built from this ModuleInfo shares one EntitySplitterModule. The atomic counters themselves are safe for concurrent use, but their values are then aggregated across parallel tests or multiple routers, which makes per-router assertions on the counts unreliable.

🐛 Proposed fix
 func (m *EntitySplitterModule) Module() core.ModuleInfo {
 	return core.ModuleInfo{
 		ID:       moduleID,
 		Priority: 1,
 		New: func() core.Module {
-			return m
+			return &EntitySplitterModule{}
 		},
 	}
 }

Note: If the intent is to share the same instance for test observability (accessing SubFetchCount/SplitRequestCount), this is acceptable for test-only code but should be documented. The current tests pass the same splitter instance to WithCustomModules, so this works as intended for the POC.
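
A factory that returns a fresh instance while carrying over configuration could be sketched as follows. Everything here is a stand-in: `Module` and `ModuleInfo` are local placeholder types rather than the router's `core` package, the `"entitySplitter"` ID is made up, and the field types (`int` for `SplitThreshold` and `BatchSize`, `atomic.Int64` for the counters) are assumptions.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Module and ModuleInfo are local stand-ins for the router's core types.
type Module interface{}

type ModuleInfo struct {
	ID       string
	Priority int
	New      func() Module
}

// EntitySplitterModule models only the fields named in the PR; types are assumed.
type EntitySplitterModule struct {
	SplitThreshold    int
	BatchSize         int
	SubFetchCount     atomic.Int64
	SplitRequestCount atomic.Int64
}

// Module returns a factory that copies configuration into a new instance,
// so each instance starts with its own zeroed counters.
func (m *EntitySplitterModule) Module() ModuleInfo {
	return ModuleInfo{
		ID:       "entitySplitter", // hypothetical ID
		Priority: 1,
		New: func() Module {
			return &EntitySplitterModule{
				SplitThreshold: m.SplitThreshold,
				BatchSize:      m.BatchSize,
			}
		},
	}
}

func main() {
	configured := &EntitySplitterModule{SplitThreshold: 1024, BatchSize: 10}
	configured.SubFetchCount.Add(5)

	fresh := configured.Module().New().(*EntitySplitterModule)
	fmt.Println(fresh.BatchSize, fresh.SubFetchCount.Load())
}
```

With this shape, a test can no longer read counters from the instance it passed in, so it would need another way to observe them. That is the trade-off the note above describes for keeping the shared instance in test-only code.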

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@router-tests/modules/entity-splitter/module.go` around lines 71 - 79, The
Module.New factory currently returns the receiver instance m, causing all
modules to share the same EntitySplitterModule (and its atomic counters); change
the New function returned in Module() to construct and return a fresh instance
(e.g., return &EntitySplitterModule{} or a new struct populated from m's config
fields) so each module gets its own counters; update tests only if they
deliberately relied on the shared instance for observability. Ensure references
to EntitySplitterModule, Module(), and ModuleInfo are used to locate and modify
the factory closure.

@codecov

codecov Bot commented Apr 20, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 65.09%. Comparing base (13a70dc) to head (6a25926).
⚠️ Report is 96 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2775      +/-   ##
==========================================
+ Coverage   64.45%   65.09%   +0.63%     
==========================================
  Files         311      251      -60     
  Lines       44295    25887   -18408     
  Branches     4764        0    -4764     
==========================================
- Hits        28551    16851   -11700     
+ Misses      15721     7664    -8057     
- Partials       23     1372    +1349     

see 562 files with indirect coverage changes


@github-actions

github-actions Bot commented May 5, 2026

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added Stale and removed Stale labels May 5, 2026
@github-actions

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions Bot added Stale and removed Stale labels May 20, 2026